AITopics | structured representation

Learning Hierarchical Semantic Image Manipulation through Structured Representations

Neural Information Processing Systems | Nov-20-2025, 22:18:04 GMT

Understanding, reasoning about, and manipulating the semantic concepts of images has been a fundamental research problem for decades. Previous work mainly focused on direct manipulation of the natural image manifold through color strokes, key-points, textures, and holes-to-fill. In this work, we present a novel hierarchical framework for semantic image manipulation. Key to our hierarchical framework is that we employ a structured semantic layout as our intermediate representation for manipulation. Initialized with coarse-level bounding boxes, our layout generator first creates a pixel-wise semantic layout capturing the object shape, object-object interactions, and object-scene relations. Then our image generator fills in the pixel-level textures guided by the semantic layout. Such a framework allows a user to manipulate images at the object level by adding, removing, and moving one bounding box at a time. Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework over existing image generation and context hole-filling models, both qualitatively and quantitatively. Benefits of the hierarchical framework are further demonstrated in applications such as semantic object manipulation, interactive image editing, and data-driven image manipulation.

learning hierarchical semantic image manipulation, name change, structured representation, (4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.78)

WebGen-V Bench: Structured Representation for Enhancing Visual Design in LLM-based Web Generation and Evaluation

Wang, Kuang-Da, Wang, Zhao, Shimose, Yotaro, Wang, Wei-Yao, Takamatsu, Shingo

arXiv.org Artificial Intelligence | Oct-20-2025

Motivated by recent advances in leveraging LLMs for coding and multimodal understanding, we present WebGen-V, a new benchmark and framework for instruction-to-HTML generation that enhances both data quality and evaluation granularity. WebGen-V contributes three key innovations: (1) an unbounded and extensible agentic crawling framework that continuously collects real-world webpages and can be leveraged to augment existing benchmarks; (2) a structured, section-wise data representation that integrates metadata, localized UI screenshots, and JSON-formatted text and image assets, with explicit alignment between content, layout, and visual components for detailed multimodal supervision; and (3) a section-level multimodal evaluation protocol aligning text, layout, and visuals for high-granularity assessment. Experiments with state-of-the-art LLMs and ablation studies validate the effectiveness of our structured data and section-wise evaluation, as well as the contribution of each component. To the best of our knowledge, WebGen-V is the first work to enable high-granularity agentic crawling and evaluation for instruction-to-HTML generation, providing a unified pipeline from real-world data acquisition and webpage generation to structured multimodal assessment.
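To make the section-wise representation in item (2) more concrete, here is a minimal Python sketch of what one such record might look like. The abstract does not give a schema, so the class name `Section`, every field name, and the helper `page_to_json` are hypothetical choices made only for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class Section:
    """One webpage section (hypothetical schema, not taken from the paper)."""
    section_id: str                  # assumed identifier, e.g. "hero"
    metadata: Dict[str, str]         # assumed free-form metadata for the section
    screenshot_path: str             # assumed path to the localized UI screenshot
    text_assets: List[str] = field(default_factory=list)
    image_assets: List[str] = field(default_factory=list)


def page_to_json(sections: List[Section]) -> str:
    """Serialize a page as a JSON list with one object per section."""
    return json.dumps([asdict(s) for s in sections], indent=2)


# Example page with a single section; all values are made up.
example_page = [
    Section(
        section_id="hero",
        metadata={"role": "header"},
        screenshot_path="shots/hero.png",
        text_assets=["Welcome"],
        image_assets=["img/banner.jpg"],
    )
]
example_json = page_to_json(example_page)
```

Keeping text, images, and the screenshot reference in the same per-section record is one simple way to keep content and visuals aligned at section granularity; the actual WebGen-V format may differ.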

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.15306

Country:

Asia > Middle East > UAE (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Transportation (1.00)
Health & Medicine > Consumer Health (1.00)
Consumer Products & Services > Travel (1.00)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

SR-LLM: Rethinking the Structured Representation in Large Language Model

Zhang, Jiahuan, Wang, Tianheng, Wu, Hanqing, Huang, Ziyi, Wu, Yulong, Chen, Dongbai, Song, Linfeng, Zhang, Yue, Rao, Guozheng, Yu, Kaicheng

arXiv.org Artificial Intelligence | Feb-20-2025

Structured representations, exemplified by Abstract Meaning Representation (AMR), have long been pivotal in computational linguistics. However, their role remains ambiguous in the era of Large Language Models (LLMs). Initial attempts to integrate structured representations into LLMs in a zero-shot setting yielded inferior performance. We hypothesize that this decline stems from the structural information being passed to LLMs in a code format unfamiliar to LLMs' training corpora. Consequently, we propose SR-LLM, a framework with two settings for exploring a better way of integrating structured representations with LLMs, from training-free and training-dependent perspectives. The former integrates structural information through natural language descriptions in LLM prompts, whereas the latter augments the model's inference capability through fine-tuning on linguistically described structured representations. Performance improvements were observed on widely used downstream datasets, with particularly notable gains of 3.17% and 12.38% on PAWS. To the best of our knowledge, this work is the first demonstration that leveraging structural representations can substantially enhance LLMs' inference capability. We hope that our work sheds light on, and encourages, future research to enhance the reasoning and interoperability of LLMs through structured data.
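The training-free setting described above, in which structure is passed to the model as natural language rather than as code-like notation, can be sketched as follows. This is only an illustration under stated assumptions: the triple format, the `ROLE_GLOSS` wording, and the functions `describe_triples` and `build_prompt` are hypothetical and are not the templates used by SR-LLM.

```python
from typing import List, Tuple

# A structure edge as (source concept, role, target concept).
# This flattened triple form is an assumption made for the sketch.
Triple = Tuple[str, str, str]

# Hypothetical plain-English glosses for a few AMR-style roles.
ROLE_GLOSS = {
    ":ARG0": "has the agent",
    ":ARG1": "has the patient or theme",
    ":time": "happens at the time",
}


def describe_triples(triples: List[Triple]) -> str:
    """Render each triple as one English sentence and join them."""
    sentences = []
    for source, role, target in triples:
        # Fall back to a generic phrasing for roles without a gloss.
        gloss = ROLE_GLOSS.get(role, f"is related via {role} to")
        sentences.append(f'The concept "{source}" {gloss} "{target}".')
    return " ".join(sentences)


def build_prompt(sentence: str, triples: List[Triple]) -> str:
    """Combine the raw sentence with the verbalized structure."""
    return (
        f"Sentence: {sentence}\n"
        f"Structure description: {describe_triples(triples)}"
    )


example_prompt = build_prompt(
    "The boy wants to go.",
    [("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "go-02")],
)
```

The resulting string would be placed in the LLM prompt ahead of the task question; how the description is worded is a design choice that the paper's own experiments, not this sketch, would need to settle.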

arxiv preprint arxiv, dataset, representation, (12 more...)

arXiv.org Artificial Intelligence

2502.14352

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Reviews: Learning Hierarchical Semantic Image Manipulation through Structured Representations

Neural Information Processing Systems | Oct-7-2024, 11:12:50 GMT

In this paper, a new method for image manipulation is proposed. The proposed method incorporates a hierarchical framework and provides both interactive and automatic semantic object-level image manipulation. In the interactive manipulation setting, the user can select a bounding box where image editing for adding and removing objects will be applied. The proposed network architecture consists of a foreground output stream, which predicts a binary object mask, and a background output stream, which produces per-pixel label maps. As a result, the proposed image manipulation method generates the output image by filling in the pixel-level textures guided by the semantic layout.

evaluation, learning hierarchical semantic image manipulation, structured representation, (5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.39)

Keeping the Questions Conversational: Using Structured Representations to Resolve Dependency in Conversational Question Answering

Zaib, Munazza, Sheng, Quan Z., Zhang, Wei Emma, Mahmood, Adnan

arXiv.org Artificial Intelligence | Apr-14-2023

An intelligent dialogue agent that can engage in conversational question answering (ConvQA) is no longer limited to sci-fi movies and has, in fact, become a reality. These intelligent agents are required to understand and correctly interpret the sequential turns provided as the context of the given question. However, these sequential questions are sometimes left implicit and thus require the resolution of natural language phenomena such as anaphora and ellipsis. The task of question rewriting has the potential to address the challenges of resolving dependencies amongst the contextual turns by transforming them into intent-explicit questions. Nonetheless, rewriting the implicit questions comes with its own challenges, such as producing verbose questions and taking the conversational aspect out of the scenario by generating self-contained questions. In this paper, we propose a novel framework, CONVSR (CONVQA using Structured Representations), for capturing and generating intermediate representations as conversational cues to enhance the capability of the QA model to better interpret incomplete questions. We also discuss how the strengths of this task could be leveraged to design more engaging and eloquent conversational agents. We test our model on the QuAC and CANARD datasets and show experimentally that our proposed framework achieves a better F1 score than the standard question rewriting model.

artificial intelligence, natural language, question answering, (17 more...)

arXiv.org Artificial Intelligence

2304.07125

Country: